SqueezeSeg: Convolutional Neural Nets with Recurrent CRF for Real-Time Road-Object Segmentation from 3D LiDAR Point Cloud
In this paper, we address semantic segmentation of road-objects from 3D LiDAR
point clouds. In particular, we wish to detect and categorize instances of
interest, such as cars, pedestrians and cyclists. We formulate this problem as
a point-wise classification problem, and propose an end-to-end pipeline called
SqueezeSeg based on convolutional neural networks (CNN): the CNN takes a
transformed LiDAR point cloud as input and directly outputs a point-wise label
map, which is then refined by a conditional random field (CRF) implemented as a
recurrent layer. Instance-level labels are then obtained by conventional
clustering algorithms. Our CNN model is trained on LiDAR point clouds from the
KITTI dataset, and our point-wise segmentation labels are derived from 3D
bounding boxes from KITTI. To obtain extra training data, we built a LiDAR
simulator into Grand Theft Auto V (GTA-V), a popular video game, to synthesize
large amounts of realistic training data. Our experiments show that SqueezeSeg
achieves high accuracy with astonishingly fast and stable runtime (8.7 ms per
frame), highly desirable for autonomous driving applications. Furthermore,
additionally training on synthesized data boosts validation accuracy on
real-world data. Our source code and synthesized data will be open-sourced.
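The SqueezeSeg abstract says the CNN takes a "transformed" point cloud but does not name the transform. One common choice for rotating LiDAR scanners is a spherical projection onto a 2D range image, so that ordinary 2D convolutions apply. The sketch below assumes that choice; the function name, grid size, and field-of-view values are hypothetical and are not taken from the paper.

```python
import math


def spherical_projection(points, height=4, width=8,
                         fov_up_deg=15.0, fov_down_deg=-15.0):
    """Project (x, y, z) points onto a height x width range image.

    Assumption: a spherical projection; the abstract only says "transformed".
    Each cell holds the range of the closest point that lands in it,
    or None when no point lands there.
    """
    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    fov = fov_up - fov_down
    image = [[None] * width for _ in range(height)]
    for x, y, z in points:
        r = math.sqrt(x * x + y * y + z * z)
        if r == 0.0:
            continue  # a point at the sensor origin has no direction
        azimuth = math.atan2(y, x)       # horizontal angle in [-pi, pi]
        elevation = math.asin(z / r)     # vertical angle in [-pi/2, pi/2]
        if not (fov_down <= elevation <= fov_up):
            continue  # outside the assumed vertical field of view
        col = min(int((azimuth + math.pi) / (2 * math.pi) * width), width - 1)
        row = min(int((fov_up - elevation) / fov * height), height - 1)
        if image[row][col] is None or r < image[row][col]:
            image[row][col] = r
    return image
```

A point-wise label map produced by a network on such a grid can be mapped back to the original points through the same row and column indices.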
A LiDAR Point Cloud Generator: from a Virtual World to Autonomous Driving
3D LiDAR scanners are playing an increasingly important role in autonomous
driving as they can generate depth information of the environment. However,
creating large 3D LiDAR point cloud datasets with point-level labels requires a
significant amount of manual annotation. This jeopardizes the efficient
development of supervised deep learning algorithms which are often data-hungry.
We present a framework to rapidly create point clouds with accurate point-level
labels from a computer game. The framework supports data collection from both
auto-driving scenes and user-configured scenes. Point clouds from auto-driving
scenes can be used as training data for deep learning algorithms, while point
clouds from user-configured scenes can be used to systematically test the
vulnerability of a neural network, and use the falsifying examples to make the
neural network more robust through retraining. In addition, the scene images
can be captured simultaneously for sensor fusion tasks, with a method
proposed to do automatic calibration between the point clouds and captured
scene images. We show a significant improvement in accuracy (+9%) in point
cloud segmentation by augmenting the training dataset with the generated
synthesized data. Our experiments also show that, by testing and retraining the
network using point clouds from user-configured scenes, the weaknesses and blind
spots of the neural network can be fixed.
Counterexample-Guided Data Augmentation
We present a novel framework for augmenting data sets for machine learning
based on counterexamples. Counterexamples are misclassified examples that have
important properties for retraining and improving the model. Key components of
our framework include a counterexample generator, which produces data items
that are misclassified by the model, and error tables, a novel data structure
that stores information pertaining to misclassifications. Error tables can be
used to explain the model's vulnerabilities and are used to efficiently
generate counterexamples for augmentation. We show the efficacy of the proposed
framework by comparing it to classical augmentation techniques on a case study
of object detection in autonomous driving based on deep neural networks.
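The abstract describes error tables only as a structure that stores information about misclassifications and helps explain a model's vulnerabilities. As a rough illustration, and not the authors' implementation, the sketch below assumes each sample is generated from a small dictionary of scene parameters, records the parameters of every misclassified sample, and reports the parameter values shared by most rows. All names (`build_error_table`, `common_feature_values`, `params`, `label`) are hypothetical.

```python
from collections import Counter


def build_error_table(samples, model):
    """Collect the generating parameters of every sample the model gets wrong.

    Assumption: each sample is a dict with "params" (scene parameters)
    and "label" (ground truth); model maps params to a predicted label.
    """
    return [s["params"] for s in samples if model(s["params"]) != s["label"]]


def common_feature_values(error_table, min_fraction=0.8):
    """Return sorted (feature, value) pairs present in at least
    min_fraction of the error-table rows."""
    if not error_table:
        return []
    counts = Counter((k, v) for row in error_table for k, v in row.items())
    n = len(error_table)
    return sorted(pair for pair, c in counts.items() if c / n >= min_fraction)
```

Feature values that dominate the table point to regions of the parameter space where new counterexamples are likely, which is one plausible way such a table could guide augmentation.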
Beating Backdoor Attack at Its Own Game
Deep neural networks (DNNs) are vulnerable to backdoor attack, which does not
affect the network's performance on clean data but would manipulate the network
behavior once a trigger pattern is added. Existing defense methods have greatly
reduced attack success rate, but their prediction accuracy on clean data still
lags behind a clean model by a large margin. Inspired by the stealthiness and
effectiveness of backdoor attack, we propose a simple but highly effective
defense framework which injects non-adversarial backdoors targeting poisoned
samples. Following the general steps in backdoor attack, we detect a small set
of suspected samples and then apply a poisoning strategy to them. The
non-adversarial backdoor, once triggered, suppresses the attacker's backdoor on
poisoned data, but has limited influence on clean data. The defense can be
carried out during data preprocessing, without any modification to the standard
end-to-end training pipeline. We conduct extensive experiments on multiple
benchmarks with different architectures and representative attacks. Results
demonstrate that our method achieves state-of-the-art defense effectiveness
with by far the lowest performance drop on clean data. Considering the
surprising defense ability displayed by our framework, we call for more
attention to utilizing backdoor for backdoor defense. Code is available at
https://github.com/damianliumin/non-adversarial_backdoor.
Comment: Accepted to ICCV 202
CCSPNet-Joint: Efficient Joint Training Method for Traffic Sign Detection Under Extreme Conditions
Traffic sign detection is an important research direction in intelligent
driving. Unfortunately, existing methods often overlook extreme conditions such
as fog, rain, and motion blur. Moreover, the end-to-end training strategy for
image denoising and object detection models fails to utilize inter-model
information effectively. To address these issues, we propose CCSPNet, an
efficient feature extraction module based on Transformers and CNNs, which
effectively leverages contextual information, achieves faster inference speed
and provides stronger feature enhancement capabilities. Furthermore, we
establish the correlation between object detection and image denoising tasks
and propose a joint training model, CCSPNet-Joint, to improve data efficiency
and generalization. Finally, to validate our approach, we create the CCTSDB-AUG
dataset for traffic sign detection in extreme scenarios. Extensive experiments
have shown that CCSPNet achieves state-of-the-art performance in traffic sign
detection under extreme conditions. Compared to end-to-end methods,
CCSPNet-Joint achieves a 5.32% improvement in precision and an 18.09%
improvement in mAP@.5.
Shift: A Zero FLOP, Zero Parameter Alternative to Spatial Convolutions
Neural networks rely on convolutions to aggregate spatial information.
However, spatial convolutions are expensive in terms of model size and
computation, both of which grow quadratically with respect to kernel size. In
this paper, we present a parameter-free, FLOP-free "shift" operation as an
alternative to spatial convolutions. We fuse shifts and point-wise convolutions
to construct end-to-end trainable shift-based modules, with a hyperparameter
characterizing the tradeoff between accuracy and efficiency. To demonstrate the
operation's efficacy, we replace ResNet's 3x3 convolutions with shift-based
modules for improved CIFAR10 and CIFAR100 accuracy using 60% fewer parameters;
we additionally demonstrate the operation's resilience to parameter reduction
on ImageNet, outperforming ResNet family members. We finally show the shift
operation's applicability across domains, achieving strong performance with
fewer parameters on classification, face verification and style transfer.
Comment: Source code will be released afterward
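To make the idea of fusing shifts with point-wise convolutions concrete, here is a minimal pure-Python sketch. It assumes each channel is moved by its own fixed integer offset with zero fill, and that a 1x1 convolution then mixes the shifted channels. How offsets are assigned to channels is not stated in the abstract, so the `offsets` argument and all function names are hypothetical.

```python
def shift_channel(channel, dy, dx):
    """Shift one H x W channel by (dy, dx), filling vacated cells with 0.

    Only moves values: no multiplications and no learned parameters.
    """
    h, w = len(channel), len(channel[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y - dy, x - dx
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = channel[sy][sx]
    return out


def shift(tensor, offsets):
    """Apply a per-channel shift to a C x H x W tensor (nested lists)."""
    return [shift_channel(ch, dy, dx) for ch, (dy, dx) in zip(tensor, offsets)]


def pointwise_conv(tensor, weights):
    """1x1 convolution: weights[o][c] mixes input channel c into output o."""
    h, w = len(tensor[0]), len(tensor[0][0])
    return [
        [[sum(wc * ch[y][x] for wc, ch in zip(row, tensor)) for x in range(w)]
         for y in range(h)]
        for row in weights
    ]
```

Because the shift only relocates values, all learned parameters sit in the point-wise weights; spatial mixing comes from different channels being displaced in different directions before they are combined.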